58 research outputs found

    Fast individual ancestry inference from DNA sequence data leveraging allele frequencies for multiple populations.

    Background: Estimation of individual ancestry from genetic data is useful for the analysis of disease association studies, understanding human population history and interpreting personal genomic variation. New, computationally efficient methods are needed for ancestry inference that can effectively utilize existing information about allele frequencies associated with different human populations and can work directly with DNA sequence reads. Results: We describe a fast method for estimating the relative contribution of known reference populations to an individual's genetic ancestry. Our method utilizes allele frequencies from the reference populations and individual genotype or sequence data to obtain a maximum likelihood estimate of the global admixture proportions using the BFGS optimization algorithm. It accounts for the uncertainty in genotypes present in sequence data by using genotype likelihoods and does not require individual genotype data from external reference panels. Simulation studies and application of the method to real datasets demonstrate that our method is significantly faster than previous methods and has comparable accuracy. Using data from the 1000 Genomes project, we show that estimates of the genome-wide average ancestry for admixed individuals are consistent between exome sequence data and whole-genome low-coverage sequence data. Finally, we demonstrate that our method can be used to estimate admixture proportions using pooled sequence data, making it a valuable tool for controlling for population stratification in sequencing-based association studies that utilize DNA pooling. Conclusions: Our method is an efficient and versatile tool for estimating ancestry from DNA sequence data and is available from https://sites.google.com/site/vibansal/software/iAdmix
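The maximum likelihood idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes hard genotype calls (0, 1 or 2 copies of the alternate allele) rather than genotype likelihoods, and it swaps in a simple EM update in place of the BFGS optimizer named in the abstract. The function names and the toy data are hypothetical, and all reference frequencies are assumed to lie strictly between 0 and 1.

```python
import math


def log_likelihood(q, genotypes, freqs):
    """Binomial log-likelihood (up to a constant) of the genotypes given proportions q.

    genotypes[j] is the alternate-allele count (0, 1, 2) at SNP j.
    freqs[k][j] is the alternate-allele frequency of reference population k at SNP j.
    """
    total = 0.0
    for j, g in enumerate(genotypes):
        # Expected allele frequency for the individual is a q-weighted mixture.
        p = sum(q[k] * freqs[k][j] for k in range(len(q)))
        total += g * math.log(p) + (2 - g) * math.log(1.0 - p)
    return total


def estimate_admixture(genotypes, freqs, iterations=200):
    """EM estimate of global admixture proportions with reference frequencies held fixed."""
    n_pops = len(freqs)
    n_alleles = 2 * len(genotypes)
    q = [1.0 / n_pops] * n_pops  # start from equal contributions
    for _ in range(iterations):
        new_q = [0.0] * n_pops
        for j, g in enumerate(genotypes):
            p = sum(q[k] * freqs[k][j] for k in range(n_pops))
            for k in range(n_pops):
                f = freqs[k][j]
                # Expected number of this individual's alleles drawn from population k.
                new_q[k] += g * q[k] * f / p + (2 - g) * q[k] * (1.0 - f) / (1.0 - p)
        q = [x / n_alleles for x in new_q]
    return q


if __name__ == "__main__":
    # Hypothetical toy data: two reference populations, four SNPs.
    toy_freqs = [[0.9, 0.8, 0.1, 0.2], [0.1, 0.2, 0.9, 0.8]]
    toy_genotypes = [2, 2, 0, 0]
    print(estimate_admixture(toy_genotypes, toy_freqs))
```

EM keeps the proportions non-negative and summing to one without any reparameterization, which is why it is convenient for a sketch; a quasi-Newton method such as BFGS would typically need far fewer iterations on genome-scale data.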

    Simulation-based homozygosity mapping with the GAW14 COGA dataset on alcoholism

    BACKGROUND: We have developed a simulation-based approach to the analysis of shared homozygous chromosomal segments and have applied it to data on allele sharing among alcoholics in a single Collaborative Study on the Genetics of Alcoholism pedigree. Our assessment of sharing involved the use of a single-nucleotide polymorphism (SNP) marker map provided by Affymetrix. RESULTS: All 11 affected individuals in the selected pedigree shared 2 copies of an allele at 4 adjacent SNPs in a region on chromosome 5. Via simulation, we determined that the probability that such sharing is caused by mere chance is less than 0.0000001. After correcting for undocumented inbreeding, this probability rose to 0.0016. The probability that the shared segment emanates from a single ancestor and is unrelated to the affection status is less than 0.0000001 in the corrected pedigree. Haplotype association analysis and a search for a protective locus using unaffected individuals yielded no significant results. CONCLUSION: Homozygosity mapping results on chromosome 5 provide suggestive evidence of the region's role as one that may harbor a genetic determinant of alcoholism. Furthermore, the probabilities of chance homozygous allele sharing for the original and for the inbreeding-corrected pedigree provide insight into the impact that inbreeding can have on such calculations

    Generalized Analysis of Molecular Variance

    Many studies in the fields of genetic epidemiology and applied population genetics are predicated on, or require, an assessment of the genetic background diversity of the individuals chosen for study. A number of strategies have been developed for assessing genetic background diversity. These strategies typically focus on genotype data collected on the individuals in the study, based on a panel of DNA markers. However, many of these strategies are either rooted in cluster analysis techniques, and hence suffer from problems inherent to the assignment of the biological and statistical meaning to resulting clusters, or have formulations that do not permit easy and intuitive extensions. We describe a very general approach to the problem of assessing genetic background diversity that extends the analysis of molecular variance (AMOVA) strategy introduced by Excoffier and colleagues some time ago. As in the original AMOVA strategy, the proposed approach, termed generalized AMOVA (GAMOVA), requires a genetic similarity matrix constructed from the allelic profiles of individuals under study and/or allele frequency summaries of the populations from which the individuals have been sampled. The proposed strategy can be used to either estimate the fraction of genetic variation explained by grouping factors such as country of origin, race, or ethnicity, or to quantify the strength of the relationship of the observed genetic background variation to quantitative measures collected on the subjects, such as blood pressure levels or anthropometric measures. Since the formulation of our test statistic is rooted in multivariate linear models, sets of variables can be related to genetic background in multiple regression-like contexts. GAMOVA can also be used to complement graphical representations of genetic diversity such as tree diagrams (dendrograms) or heatmaps. We examine features, advantages, and power of the proposed procedure and showcase its flexibility by using it to analyze a wide variety of published data sets, including data from the Human Genome Diversity Project, classical anthropometry data collected by Howells, and the International HapMap Project.
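The "fraction of genetic variation explained by grouping factors" can be illustrated with a small sketch. The abstract does not give the test statistic, so the code below is an assumption: it uses the familiar partition of squared pairwise distances into among-group and within-group sums of squares for a single categorical factor, with a permutation p-value. It takes a distance matrix rather than the similarity matrix mentioned in the abstract, and the function names and toy data are hypothetical.

```python
import random


def partition_variation(dist, groups):
    """Return (fraction of variation explained, pseudo-F) for one grouping factor.

    dist is a symmetric matrix of pairwise genetic distances;
    groups[i] is the group label of individual i.
    """
    n = len(dist)
    labels = sorted(set(groups))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for label in labels:
        members = [i for i in range(n) if groups[i] == label]
        pair_sum = sum(
            dist[i][j] ** 2 for a, i in enumerate(members) for j in members[a + 1:]
        )
        ss_within += pair_sum / len(members)
    ss_among = ss_total - ss_within
    df_among = len(labels) - 1
    df_within = n - len(labels)
    pseudo_f = (ss_among / df_among) / (ss_within / df_within)
    return ss_among / ss_total, pseudo_f


def permutation_p_value(dist, groups, n_perm=999, seed=0):
    """Permutation p-value for the pseudo-F statistic, shuffling group labels."""
    rng = random.Random(seed)
    _, observed = partition_variation(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        _, f_perm = partition_variation(dist, shuffled)
        if f_perm >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical toy data: distance 1 within a group, 5 between groups.
    toy_groups = ["A", "A", "A", "B", "B", "B"]
    toy_dist = [
        [0 if i == j else (1 if toy_groups[i] == toy_groups[j] else 5) for j in range(6)]
        for i in range(6)
    ]
    print(partition_variation(toy_dist, toy_groups))
    print(permutation_p_value(toy_dist, toy_groups))
```

Relating genetic background to quantitative covariates, as the abstract describes, would require a regression-style formulation instead of group labels; this sketch covers only the categorical case.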

    DNA variation and brain region-specific expression profiles exhibit different relationships between inbred mouse strains: implications for eQTL mapping studies

    BACKGROUND: Expression quantitative trait locus (eQTL) mapping is used to find loci that are responsible for the transcriptional activity of a particular gene. In recent eQTL studies, expression profiles were derived from either homogenized whole brain or collections of large brain regions. However, the brain is a very heterogeneous organ, and expression profiles of different brain regions vary significantly. Because of the importance and potential power of eQTL studies in identifying regulatory networks, we analyzed gene expression patterns in different brain regions from multiple inbred mouse strains and investigated the implications for the design and analysis of eQTL studies. RESULTS: Gene expression profiles of five brain regions in six inbred mouse strains were studied. Few genes exhibited a significant strain-specific expression pattern, whereas a large number of genes exhibited brain region-specific patterns. We constructed phylogenetic trees based on the expression relationships between the strains and compared them with a DNA-level relationship tree. The trees based on the expression of strain-specific genes were constant across brain regions and mirrored DNA-level variation. However, the trees based on region-specific genes exhibited a different set of strain relationships, depending on the brain region. An eQTL analysis showed enrichment of cis-acting regulators among strain-specific genes, whereas brain region-specific genes appear to be mainly regulated by trans-acting elements. CONCLUSION: Our results suggest that many regulatory networks are highly brain region specific and indicate the importance of conducting eQTL mapping studies using data from brain regions or tissues that are physiologically and phenotypically relevant to the trait of interest

    Correlation Analysis of Genetic Admixture and Social Identification with Body Mass Index in a Native American Community

    OBJECTIVES: Body mass index (BMI) is a well-known measure of obesity with a multitude of genetic and non-genetic determinants. Identifying the underlying factors associated with BMI is difficult because of its multifactorial etiology that varies as a function of geoethnic background and socioeconomic setting. Thus, we pursued a study exploring the influence of the degree of Native American admixture on BMI (as well as weight and height individually) in a community sample of Native Americans (n = 846) while accommodating a variety of socioeconomic and cultural factors. METHODS: Participants' degree of Native American (NA) ancestry was estimated using a genome-wide panel of markers. The participants also completed an extensive survey of cultural and social identity measures: the Indian Culture Scale (ICS) and the Orthogonal Cultural Identification Scale (OCIS). Multiple linear regression was used to examine the relation between these measures and BMI. RESULTS: Our results suggest that BMI is correlated positively with the proportion of NA ancestry. Age was also significantly associated with BMI, while gender and socioeconomic measures (education and income) were not. For the two cultural identity measures, the ICS showed a positive correlation with BMI, while OCIS was not associated with BMI. CONCLUSIONS: Taken together, these results suggest that genetic and cultural environmental factors, rather than socioeconomic factors, account for a substantial proportion of variation in BMI in this population. Further, significant correlations between degree of NA ancestry and BMI suggest that admixture mapping may be appropriate to identify loci associated with BMI in this population

    Association and ancestry analysis of sequence variants in ADH and ALDH using alcohol-related phenotypes in a Native American community sample

    Higher rates of alcohol use and other drug-dependence have been observed in some Native American populations relative to other ethnic groups in the U.S. Previous studies have shown that alcohol dehydrogenase (ADH) genes and aldehyde dehydrogenase (ALDH) genes may affect the risk of development of alcohol dependence, and that polymorphisms within these genes may differentially affect risk for the disorder depending on the ethnic group evaluated. We evaluated variations in the ADH and ALDH genes in a large study investigating risk factors for substance use in a Native American population. We assessed ancestry admixture and tested for associations between alcohol-related phenotypes in the genomic regions around the ADH1-7 and ALDH2 and ALDH1A1 genes. Seventy-two (72) ADH variants showed significant evidence of association with a severity level of alcohol drinking-related dependence symptoms phenotype. These significant variants spanned across the entire 7 ADH gene cluster regions. Two significant associations, one in ADH and one in ALDH2, were observed with alcohol dependence diagnosis. Seventeen (17) variants showed significant association with the largest number of alcohol drinks ingested during any 24-hour period. Variants in or near ADH7 were significantly negatively associated with alcohol-related phenotypes, suggesting a potential protective effect of this gene. In addition, our results suggested that a higher degree of Native American ancestry is associated with higher frequencies of potential risk variants and lower frequencies of potential protective variants for alcohol dependence phenotypes

    Comprehensive linkage and linkage heterogeneity analysis of 4344 sibling pairs affected with hypertension from the Family Blood Pressure Program

    Linkage analyses of complex, multifactorial traits and diseases, such as essential hypertension, have been difficult to interpret and reconcile. Many published studies provide evidence suggesting that different genes and genomic regions influence hypertension, but knowing which of these studies reflect true positive results is challenging. The reasons for this include the diversity of analytical methods used across these studies, the different samples and sample sizes in each study, and the complicated biological underpinnings of hypertension. We have undertaken a comprehensive linkage analysis of 371 autosomal microsatellite markers genotyped on 4,334 sibling pairs affected with hypertension from five ethnic groups sampled from 13 different field centers associated with the Family Blood Pressure Program (FBPP). We used a single analytical technique known to be robust to interpretive problems associated with a lack of completely informative markers to assess evidence for linkage to hypertension both within and across the ethnic groups and field centers. We find evidence for linkage to a number of genomic regions, with the most compelling evidence from analyses that combine data across field center and ethnic groups (e.g., chromosomes 2 and 9). We also pursued linkage analyses that accommodate locus heterogeneity, which is known to plague the identification of disease susceptibility loci in linkage studies of complex diseases. We find evidence for linkage heterogeneity on chromosomes 2 and 17. Ultimately our results suggest that evidence for linkage heterogeneity can only be detected with large sample sizes, such as the FBPP, which is consistent with theoretical sample size calculations. Genet. Epidemiol. 2007. © 2007 Wiley-Liss, Inc. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/56011/1/20202_ftp.pd

    Long-term influence of normal variation in neonatal characteristics on human brain development

    It is now recognized that a number of cognitive, behavioral, and mental health outcomes across the lifespan can be traced to fetal development. Although the direct mediation is unknown, the substantial variance in fetal growth, most commonly indexed by birth weight, may affect lifespan brain development. We investigated effects of normal variance in birth weight on MRI-derived measures of brain development in 628 healthy children, adolescents, and young adults in the large-scale multicenter Pediatric Imaging, Neurocognition, and Genetics study. This heterogeneous sample was recruited through geographically dispersed sites in the United States. The influence of birth weight on cortical thickness, surface area, and striatal and total brain volumes was investigated, controlling for variance in age, sex, household income, and genetic ancestry factors. Birth weight was found to exert robust positive effects on regional cortical surface area in multiple regions as well as total brain and caudate volumes. These effects were continuous across birth weight ranges and ages and were not confined to subsets of the sample. The findings show that (i) aspects of later child and adolescent brain development are influenced at birth and (ii) relatively small differences in birth weight across groups and conditions typically compared in neuropsychiatric research (e.g., Attention Deficit Hyperactivity Disorder, schizophrenia, and personality disorders) may influence group differences observed in brain parameters of interest at a later stage in life. These findings should serve to increase our attention to early influences